Dropping the unnamed column
Null value check
Duplicate value check and deletion
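
The three cleaning steps above could look roughly like the following sketch. The column names and toy rows are made up for illustration, and `clean_frame` is a hypothetical helper, not a function from the notebook.

```python
import pandas as pd

def clean_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Drop index-like 'Unnamed' columns, report nulls, and remove duplicate rows."""
    # Columns such as 'Unnamed: 0' usually come from an index saved to CSV.
    unnamed = [c for c in df.columns if str(c).startswith("Unnamed")]
    df = df.drop(columns=unnamed)
    # Missing values per column.
    print(df.isnull().sum())
    # Number of fully duplicated rows, then remove them.
    print("duplicate rows:", int(df.duplicated().sum()))
    return df.drop_duplicates().reset_index(drop=True)

# Toy data standing in for the real dataset (values are invented).
toy = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2],
    "Country": ["Country_01", "Country_01", "Country_02"],
    "Local": ["Local_01", "Local_01", "Local_02"],
})
cleaned = clean_frame(toy)
```

Dropping the unnamed column first matters here: while it is present, every row is unique and no duplicates would be detected.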

The output above shows that every column of the dataframe has the dtype object.

Checking the Statistical Description

Finding the unique values of all the columns of the dataframe

Replacing and Labelling the values of the columns
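
One possible way to relabel values is a dictionary lookup with `Series.map`. The mappings and column names below are assumptions chosen for illustration; the notebook's actual replacements are not shown in this text.

```python
import pandas as pd

# Hypothetical label mappings (assumed, not taken from the notebook).
ROMAN_TO_INT = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6}
GENDER_TO_INT = {"Male": 0, "Female": 1}

# Toy rows with invented values.
toy = pd.DataFrame({
    "Accident Level": ["I", "IV", "II"],
    "Gender": ["Male", "Female", "Male"],
})

# Series.map looks each value up in the dictionary; unmapped values become NaN.
toy["Accident Level"] = toy["Accident Level"].map(ROMAN_TO_INT)
toy["Gender"] = toy["Gender"].map(GENDER_TO_INT)
```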

Changing the date column to date format

Extracting the month, year and day columns from the date
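
A minimal sketch of the date conversion and extraction, assuming the date column is called `Date` and holds strings in a format that `pd.to_datetime` can parse (both are assumptions; the sample dates are invented):

```python
import pandas as pd

# Toy rows; the real column name and date format may differ.
toy = pd.DataFrame({"Date": ["2016-01-01 00:00:00", "2017-02-09 00:00:00"]})

# Convert from object (string) dtype to datetime64.
toy["Date"] = pd.to_datetime(toy["Date"])

# The .dt accessor exposes the calendar components of each timestamp.
toy["Year"] = toy["Date"].dt.year
toy["Month"] = toy["Date"].dt.month
toy["Day"] = toy["Date"].dt.day
toy["Weekday"] = toy["Date"].dt.day_name()
```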

Function to map months to seasons
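
A season function could be sketched as below. The month-to-season mapping uses northern-hemisphere meteorological seasons purely as an assumption; the notebook's own mapping is not shown, so the table should be swapped if the data comes from a different climate.

```python
# Assumed mapping: northern-hemisphere meteorological seasons.
SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}

def month_to_season(month: int) -> str:
    """Return the season name for a month number from 1 to 12."""
    if month not in SEASON_BY_MONTH:
        raise ValueError(f"month must be between 1 and 12, got {month!r}")
    return SEASON_BY_MONTH[month]
```

With pandas, such a function can be applied to a month column through `Series.apply`, for example `df["Month"].apply(month_to_season)`.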

Univariate analysis covering Count, Pie, Scatter, Histogram plots

1.Country

From the above plots, we can conclude the following:

  1. Country_01 has a count of about 248, Country_02 about 129, and Country_03 about 41.

  2. From the above pie chart, it can be inferred that Country_01 is the most affected country with about 59% of the accidents, and Country_03 is the least affected.

  3. From the above output, Country_01 has the maximum number of accidents and Country_03 the minimum.
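
The counts behind a count plot and the shares behind a pie chart can both be derived from `value_counts`. The sketch below rebuilds a toy `Country` column from the approximate counts quoted above; `count_and_share` is a hypothetical helper.

```python
import pandas as pd

def count_and_share(series: pd.Series) -> pd.DataFrame:
    """Return per-category counts and percentage shares, largest first."""
    counts = series.value_counts()
    percent = (counts / counts.sum() * 100).round(1)
    return pd.DataFrame({"count": counts, "percent": percent})

# Toy column rebuilt from the approximate counts in the text (248, 129, 41).
toy = pd.Series(
    ["Country_01"] * 248 + ["Country_02"] * 129 + ["Country_03"] * 41,
    name="Country",
)
summary = count_and_share(toy)
print(summary)
```

The same helper applies to any of the categorical columns in this section, such as Local, Industry Sector, or Gender.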

2.Local
  1. Most of the accidents happen in Local_03, with a count of about 89.
  2. The most accidents took place in Local_03 (21.18%) and the fewest in Local_09 and Local_11 (0.47% each).
  3. From the above output, it can be inferred that Local_03 is the most prone to accidents.
  4. From the above histogram, it can be observed that the number of accidents in Local_03 is about 90.
3.Industry sector
  1. From the above, it is evident that the Mining sector is the most prone to accidents, with about 237.
  2. The most affected sector is Mining: 56.71% of the accidents occur there.
  3. Mining is therefore clearly more prone to accidents than the other sectors.
4.Accident level
  1. From the above count plot, it is evident that most accidents belong to Accident Level I, with a count of more than 300.
  2. From the above pie chart, it can be determined that Level I accounts for the most accidents, about 73.9%.
5.Potential accident level
  1. From the above count plot, it can be determined that the most frequent Potential Accident Level is IV, with a count of about 141.
  2. From the above pie chart, it is evident that the most frequent Potential Accident Level is IV, with 33.7%.
6.Gender
  1. From the above plot, it is evident that most workers affected by accidents are male, with a count of 396.
  2. From the above pie chart, it is evident that most workers affected by accidents are male.
7.Nature of the employee

From the above, it can be determined that Third Party employees are the most prone to accidents.

8.Critical risk

When we count the number of incidents by each type of critical risk, Others tops the list.

9.Year

From the above, it is evident that most accidents happened in 2016, more than 250.

10.Month
  1. From the above, it can be determined that most accidents happened in February.
  2. From the above, it is evident that February accounts for the largest share of accidents, 14.6%.
11.WeekDay
  1. From the above count plot, it can be determined that the most accidents happened on Thursday, approximately 76.
  2. From the above pie chart, it is evident that most accidents happened on Thursday, equivalent to 18.2%.

Bivariate analysis

Gender vs RestAll

1.Gender vs Accident level

From the above count plot, it can be determined that most accidents involved male workers at Accident Level I.
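
The numbers behind a two-variable count plot can be read from a contingency table. This is a sketch with invented rows; the column names `Gender` and `Accident Level` are assumed to match the dataset.

```python
import pandas as pd

# Toy rows (invented) for two categorical columns.
toy = pd.DataFrame({
    "Gender": ["Male", "Male", "Male", "Female", "Male"],
    "Accident Level": ["I", "I", "II", "I", "IV"],
})

# Rows are genders, columns are accident levels, cells are counts.
table = pd.crosstab(toy["Gender"], toy["Accident Level"])
print(table)

# The most frequent (gender, level) combination.
top_gender, top_level = toy.groupby(["Gender", "Accident Level"]).size().idxmax()
```

The same pattern covers the other pairings in this section by swapping the two column names.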

2.Gender vs Potential Accident Level

From the above, it can be determined that accidents at every potential level happened more often to males than to females, and that Potential Accident Level IV is dominant.

3.Gender vs Country

From the above count plot, it can be determined that the maximum number of accidents happened to males in Country_01, about 241.

4.Gender vs Industry Sector

From the above count plot, it is evident that most of the accidents happened to males in the mining sector, around 232.

5.Gender vs Year

From the above count plot, it is evident that the maximum number of accidents happened to males in 2016, with a count of 269.

6.Gender vs month

From the above count plot, it can be determined that the maximum number of accidents happened to males in February, with a count of 57.

7.Gender vs weekday

The maximum number of accidents happened to males on Thursday, with a count of more than 73.

8. Gender vs Nature of Employee

From the above output, it is evident that the maximum number of accidents happened to male Third Party employees, 176 in total.

9. Gender vs Critical Risk

Critical Risk of type "Others" is dominant for both male and female workers.

Industry sector vs RestAll

1. Industry Sector Vs Accident Level

The maximum number of accidents happened in the mining sector at Accident Level I, 163 in total.

2.Industry sector vs potential accident level

The maximum number of accidents happened at Potential Accident Level IV in the mining sector, with a count of 99. The minimum for the mining sector was at Potential Accident Level VI.

3.Industry Sector vs Critical Risk

From the above count plot, it is evident that the maximum number of accidents happened in mining with a critical risk of "Others", about 175.

4.Industry sector vs Local

The most accidents happened in Local_03 in the mining sector, more than 80. The fewest took place in Local_11 in the others sector.

5. Industry Sector Vs Year

From the above plot, the following can be determined:

1. In 2016, the mining sector had 160 accidents.

2. In 2016, the metals sector had about 100 accidents.

3. In 2016, the others sector had about 30 accidents.

Hence, the mining sector had the most accidents in 2016.

4. In 2017, the mining sector had 80 accidents.

5. In 2017, the metals sector had about 40 accidents.

6. In 2017, the others sector had 20 accidents.

Hence, the mining sector also had the most accidents in 2017.

6. Industry Sector Vs Month

The maximum number of accidents happened in February in the mining sector. The fewest took place in December in the others sector.

7.Industry sector vs Weekday

The maximum number of accidents happened on Saturday in the mining sector, more than 40. The fewest happened on Sunday in the others sector.

8. Industry sector Vs country

From the above count plot, it is evident that the maximum number of accidents took place in Country_01 in the mining sector, 200 in total. The fewest took place in Country_01 in the others sector.

9.Industry Sector Vs nature of employee

From the above count plot, it is evident that the maximum number of accidents took place in the mining sector among Third Party employees, about 120. The fewest took place in the others sector among workers of type Employee.

Country vs RestAll

1. Country vs Year

From the above output, the following can be determined:

1. Country_01 had 174 accidents in 2016.

2. Country_01 had about 74 accidents in 2017.

3. Country_02 had more than 86 accidents in 2016.

4. Country_02 had about 43 accidents in 2017.

5. Country_03 had about 23 accidents in 2016.

6. Country_03 had about 18 accidents in 2017.

2. Country Vs accident level

From the above count plot, it is evident that the maximum number of accidents took place at Accident Level I in Country_01.

3. Country Vs Potential Accident Level

From the above plot, it is evident that the maximum number of accidents occurred in Country_01 at Potential Accident Level III.

4. Country Vs Local

Country_01 is most dominant in Local_03 and least dominant in Local_12.

5.Country Vs Nature Of Employee

Accidents in Country_01 mostly involve Third Party employees, while Country_03 is least dominant for Third Party (Remote).

6. Country Vs Critical Risk

Country_01 is most dominant for the "Others" Critical Risk, and Critical Risk counts are least dominant in Country_03.

Local Vs Rest All

1. Local Vs Accident Level

Accident Level I is most dominant in Local_02 with 65 accidents, while Accident Level V is the least frequent across all Locals.

2. Local Vs Potential Accident Level

Overall, Local_03 is the most prone to accidents across multiple Potential Accident Levels, while Local_12 is the least.

3. Local Vs Natureofemployee

The Employee type is the most dominant across all Locals, while Third Party (Remote) is the least dominant.

4. Local Vs Critical Risk

Critical Risk of type "Others" is dominant across all Locals

5. Local Vs Year

Year 2016 has more accidents across all Local regions compared to 2017

Accident Level Vs Rest All

1. Accident Level Vs Potential Accident Level

Accident Level I is the accident level most often associated with each of the Potential Accident Levels I, II, III, IV, V and VI.

2. Accident Level Vs Natureofemployee

Accident Level I is the most dominant across all Employee types, while Level V is the least frequent across all types.

3. Accident Level Vs Critical Risk

Accident Level I is most dominant with the "Others" Critical Risk type.

4. Accident Level Vs Year

Accident Level I is the most dominant in both 2016 and 2017, and Level V is the least frequent.

5. Accident Level Vs Month

Accident Level I dominates across all months, while Level V is the least frequent.

6. Accident Level Vs Country

Accident Level I is the most dominant across all countries, while Accident Level V is the least dominant.

Potential Accident Level Vs Rest All (Remaining uncovered)

1. Potential Accident Level Vs Natureofemployee

Potential Accident Level IV dominates among Third Party employees, while Level VI among Third Party (Remote) employees is the least frequent combination overall.

2. Potential Accident Level Vs Critical Risk

Among all Critical Risk types, "Others" is dominant across all Potential Accident Levels.

3. Potential Accident Level Vs Year

The number of accidents decreased at every Potential Accident Level from 2016 to 2017. Potential Accident Level IV is dominant in both 2016 and 2017.

4. Potential Accident Level Vs Country

Potential Accident Level IV is dominant across all countries, while Level VI has the fewest accidents.

Natureofemployee Vs RestAll

1. Natureofemployee Vs Critical Risk

Critical Risk of type "Others" is dominant across all Types of Employees

Year Vs RestAll

1. Year vs month

From the above plot, it is evident that the maximum number of accidents happened in March 2016.

2. Year vs Local

From the above plot, it can be determined that the maximum number of accidents took place in Local_03 in 2016.

3. Year vs Weekday

From the above plot, it is evident that the maximum number of accidents took place on Thursdays in 2016.

4.Year vs Critical Risk

From the above plot, it is evident that the maximum number of accidents involved the "Others" Critical Risk in 2016.

Multivariate analysis

From the above correlation diagram, it is clear that "Local" and "Year" are moderately correlated.
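
A correlation matrix over categorical columns needs numeric codes first. The sketch below assumes the columns were label-encoded with integer category codes; the notebook's actual encoding is not shown here, and the toy rows are invented. Integer codes impose an arbitrary order on nominal categories, so Pearson coefficients computed this way should be read with caution.

```python
import pandas as pd

# Toy rows (invented); column names are assumed to match the dataset.
toy = pd.DataFrame({
    "Local": ["Local_01", "Local_03", "Local_03", "Local_05", "Local_01", "Local_05"],
    "Year": [2016, 2016, 2017, 2017, 2016, 2017],
    "Gender": ["Male", "Male", "Female", "Male", "Female", "Male"],
})

# Replace every value by the integer code of its category.
encoded = toy.apply(lambda col: col.astype("category").cat.codes)

# Pairwise Pearson correlation between the encoded columns.
corr = encoded.corr()
print(corr.round(2))
```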

**Groupby Analysis**

Year wise distribution of accidents and potential accident levels

  1. Year 2016 with Accident Level I has maximum accidents of 64 with Potential Accident Level III and 62 with Potential Accident Level II

  2. Year 2017 with Accident Level I has maximum accidents of 26 with Potential Accident Level II and 25 with Potential Accident Level III,IV
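
Counts of this kind can be produced by grouping on several columns at once and taking the group sizes. This is a sketch on invented rows, assuming the column names below match the dataset.

```python
import pandas as pd

# Toy rows (invented).
toy = pd.DataFrame({
    "Year": [2016, 2016, 2016, 2017, 2017],
    "Accident Level": ["I", "I", "I", "I", "II"],
    "Potential Accident Level": ["III", "III", "II", "II", "IV"],
})

# One row per observed (year, level, potential level) combination with its count.
counts = (
    toy.groupby(["Year", "Accident Level", "Potential Accident Level"])
    .size()
    .rename("count")
    .reset_index()
    .sort_values("count", ascending=False)
)
print(counts)
```

Changing the list of grouping columns gives the other breakdowns in this section, for example by Industry Sector and Country.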

Year wise distribution of Industry Sector and accident levels

  1. Year 2016 with Industry Sector of Type "Metals" has maximum accidents of 79 with Accident Level I
  2. Year 2016 with Industry Sector of Type "Mining" has maximum accidents of 112 with Accident Level I
  3. Year 2016 with Industry Sector of Type "Others" has maximum accidents of 20 with Accident Level I

  4. Year 2017 with Industry Sector of Type "Metals" has maximum accidents of 28 with Accident Level I

  5. Year 2017 with Industry Sector of Type "Mining" has maximum accidents of 51 with Accident Level I
  6. Year 2017 with Industry Sector of Type "Others" has maximum accidents of 19 with Accident Level I

Industry Sector wise distribution of Country and accident levels

  1. Metals in Country_01 has the most accidents at Level I, with a count of 36
  2. Metals in Country_02 has the most accidents at Level I, with a count of 71
  3. Mining in Country_01 has the most accidents at Level I, with a count of 140
  4. Mining in Country_02 has the most accidents at Level I, with a count of 23
  5. Others in Country_03 has the most accidents at Level I, with a count of 34

Word Cloud Analysis

WordCloud for Accident Level and Description

WordCloud for Potential Accident Level and Description

WordCloud for Industry Sector and Description

WordCloud for Country and Description
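
A word cloud is a drawing of word frequencies, so the per-group frequencies can be sketched with the standard library alone. The descriptions, the stop-word list, and the helper `word_frequencies` below are all invented for illustration.

```python
import re
from collections import Counter

import pandas as pd

# Tiny hypothetical stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "was", "his", "by", "with"}

def word_frequencies(texts) -> Counter:
    """Count lower-cased alphabetic words across texts, skipping stop words."""
    counter = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", str(text).lower())
        counter.update(w for w in words if w not in STOPWORDS)
    return counter

# Toy rows with invented descriptions.
toy = pd.DataFrame({
    "Accident Level": ["I", "I", "IV"],
    "Description": [
        "The employee hit his hand with the hammer",
        "Employee slipped and hurt the left hand",
        "The operator was struck by a falling rock",
    ],
})

# One frequency table per accident level.
freq_by_level = {
    level: word_frequencies(group["Description"])
    for level, group in toy.groupby("Accident Level")
}
print(freq_by_level["I"].most_common(3))
```

If the notebook relies on the third-party `wordcloud` package (an assumption), frequency dictionaries like these can be passed to `WordCloud.generate_from_frequencies` to draw one cloud per group. Grouping by Potential Accident Level, Industry Sector, or Country instead follows the same pattern.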